A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics.

Authors

  • Juliane Schäfer
  • Korbinian Strimmer
Abstract

Inferring large-scale covariance matrices from sparse genomic data is a ubiquitous problem in bioinformatics. Clearly, the widely used standard covariance and correlation estimators are ill-suited for this purpose. As a statistically efficient and computationally fast alternative, we propose a novel shrinkage covariance estimator that exploits the Ledoit-Wolf (2003) lemma for analytic calculation of the optimal shrinkage intensity. Subsequently, we apply this improved covariance estimator (which has guaranteed minimum mean squared error, is well-conditioned, and is always positive definite even for small sample sizes) to the problem of inferring large-scale gene association networks. We show that it performs very favorably compared to competing approaches, both in simulations and in application to real expression data.
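
To make the idea of an analytically computed shrinkage intensity concrete, here is a minimal Python sketch. The abstract does not state the shrinkage target or the exact intensity formula, so the choices below are assumptions: the target is taken to be the diagonal of the sample covariance (variances kept, covariances set to zero), and the intensity is estimated as the summed estimated variances of the off-diagonal sample covariances divided by their summed squares, clipped to [0, 1]. The function name `shrink_covariance` is hypothetical and this is not the authors' reference implementation.

```python
import numpy as np

def shrink_covariance(X):
    """Shrink the sample covariance of X toward its own diagonal.

    X has shape (n_samples, n_features). Returns (shrunk_cov, lam).
    Assumption: diagonal target and a plug-in estimate of the intensity.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)                      # center each variable

    # W[k, i, j] is the centered cross-product of variables i, j in sample k.
    # This uses O(n * p^2) memory, which is fine for a sketch only.
    W = Xc[:, :, None] * Xc[:, None, :]
    W_bar = W.mean(axis=0)

    S = n / (n - 1) * W_bar                      # unbiased sample covariance
    # Estimated variance of each individual entry of S.
    var_S = n / (n - 1) ** 3 * ((W - W_bar) ** 2).sum(axis=0)

    off = ~np.eye(p, dtype=bool)                 # mask of off-diagonal entries
    denom = (S[off] ** 2).sum()
    lam = var_S[off].sum() / denom if denom > 0 else 1.0
    lam = float(np.clip(lam, 0.0, 1.0))          # keep intensity in [0, 1]

    target = np.diag(np.diag(S))                 # assumed diagonal target
    return lam * target + (1.0 - lam) * S, lam

# Usage with more variables than samples (synthetic data, for illustration).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 50))
S_star, lam_demo = shrink_covariance(X_demo)
```

Because the target is positive definite whenever all sample variances are positive, any intensity greater than zero yields a positive definite estimate even when the number of variables exceeds the number of samples, which is the small-sample property the abstract highlights.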

Similar resources

Shrinkage Estimation of the Power Spectrum Covariance Matrix

We introduce a novel statistical technique, shrinkage estimation, to estimate the power spectrum covariance matrix from a limited number of simulations. We optimally combine an empirical estimate of the covariance with a model (the target) to minimize the total mean squared error compared to the true underlying covariance. We test our technique on N-body simulations and evaluate its performance...

An Introduction to Shrinkage Estimation of the Covariance Matrix: A Pedagogic Illustration

Shrinkage estimation of the covariance matrix of asset returns was introduced to the finance profession several years ago. Since then, the approach has also received considerable attention in various life science studies, as a remedial measure for covariance matrix estimation with insufficient observations of the underlying variables. The approach is about taking a weighted average of the sampl...

Non-linear shrinkage estimation of large-scale structure covariance

In many astrophysical settings, covariance matrices of large data sets have to be determined empirically from a finite number of mock realizations. The resulting noise degrades inference and precludes it completely if there are fewer realizations than data points. This work applies a recently proposed non-linear shrinkage estimator of covariance to a realistic example from large-scale structure...

Nonlinear Shrinkage Estimation of Large-dimensional Covariance Matrices by Olivier Ledoit

Many statistical applications require an estimate of a covariance matrix and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists an extensive literature concerning improved estimators in such situations. In the absence of furthe...

Nonlinear shrinkage estimation of large-dimensional covariance matrices

Many statistical applications require an estimate of a covariance matrix and/or its inverse. When the matrix dimension is large compared to the sample size, which happens frequently, the sample covariance matrix is known to perform poorly and may suffer from ill-conditioning. There already exists an extensive literature concerning improved estimators in such situations. In the absence of further kn...

Journal:
  • Statistical applications in genetics and molecular biology

Volume 4

Publication date: 2005